1,967 research outputs found

    Detection of Gene Expression in an Individual Cell Type within a Cell Mixture Using Microarray Analysis

    BACKGROUND: A central issue in the design of microarray-based analysis of global gene expression is the choice between using cells of a single type and a mixture of cells. This study quantified the proportion of lipopolysaccharide (LPS)-induced differentially expressed monocyte genes that could be measured in peripheral blood mononuclear cells (PBMC), and determined the extent to which gene expression in the non-monocyte cell fraction diluted or obscured fold changes that could be detected in the cell mixture. METHODOLOGY/PRINCIPAL FINDINGS: Human PBMC were stimulated with LPS, and monocytes were then isolated by positive (Mono+) or negative (Mono-) selection. The non-monocyte cell fraction (MonoD) remaining after positive selection of monocytes was used to determine the effect of non-monocyte cells on overall expression. RNA from LPS-stimulated PBMC, Mono+, Mono- and MonoD samples was co-hybridised with unstimulated RNA for each cell type on oligonucleotide microarrays. There was a positive correlation in gene expression between PBMC and both Mono+ (0.77) and Mono- (0.61-0.67) samples. Analysis of individual genes that were differentially expressed in Mono+ and Mono- samples showed that the ability to detect expression of some genes was similar when analysing PBMC, but for others, differential expression was either not detected or changed in the opposite direction. As a result of the dilutional or obscuring effect of gene expression in non-monocyte cells, overall about half of the statistically significant LPS-induced changes in gene expression in monocytes were not detected in PBMC. However, 97% of genes with a four-fold or greater change in expression in monocytes after LPS stimulation, and almost all (96-100%) of the top 100 most differentially expressed monocyte genes were detected in PBMC.
CONCLUSIONS/SIGNIFICANCE: The effect of non-responding cells in a mixture dilutes or obscures the detection of subtle changes in gene expression in an individual cell type. However, for studies in which only the most highly differentially expressed genes are of interest, separating and analysing individual cell types may be unnecessary.
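
The dilution effect described in the abstract above can be illustrated with simple arithmetic. The sketch below is not the authors' method: it assumes a hypothetical two-compartment model in which the responding cell type contributes a fraction `frac` of the mixture, each compartment has a baseline expression level, and the observed mixture signal is the weighted sum of the two. The function name and all parameters are illustrative assumptions.

```python
def mixture_fold_change(frac, fold_target, fold_other=1.0,
                        base_target=1.0, base_other=1.0):
    """Fold change seen in a cell mixture (hypothetical two-compartment model).

    frac        -- fraction of the mixture made up of the target cell type
    fold_target -- fold change of the gene in the target cells
    fold_other  -- fold change of the gene in all other cells
    base_*      -- baseline expression of the gene in each compartment
    """
    if not 0.0 <= frac <= 1.0:
        raise ValueError("frac must lie between 0 and 1")
    before = frac * base_target + (1.0 - frac) * base_other
    after = (frac * base_target * fold_target
             + (1.0 - frac) * base_other * fold_other)
    if before <= 0.0:
        raise ValueError("baseline expression in the mixture must be positive")
    return after / before


# A large change in 20% of the cells is diluted but still visible.
print(mixture_fold_change(0.2, 4.0))
# A subtle change is almost lost in the mixture.
print(mixture_fold_change(0.2, 1.5))
# An opposite response in the other cells can reverse the direction.
print(mixture_fold_change(0.2, 1.5, fold_other=0.5))
```

Under these assumptions, large fold changes survive mixing while subtle ones shrink towards 1 or flip direction, which is consistent with the pattern the abstract reports.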

    Differential expression analysis for sequence count data

    Motivation: High-throughput nucleotide sequencing provides quantitative readouts in assays for RNA expression (RNA-Seq), protein-DNA binding (ChIP-Seq) or cell counting (barcode sequencing). Statistical inference of differential signal in such data requires estimation of their variability throughout the dynamic range. When the number of replicates is small, error modelling is needed to achieve statistical power.

Results: We propose an error model that uses the negative binomial distribution, with variance and mean linked by local regression, to model the null distribution of the count data. The method controls type-I error and provides good detection power.

Availability: A free open-source R software package, DESeq, is available from the Bioconductor project and from http://www-huber.embl.de/users/anders/DESeq
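
To make the mean-variance link in the abstract above concrete, here is a minimal Python sketch of the standard negative binomial parametrisation, in which the variance is `mu + dispersion * mu**2`. It is not the DESeq implementation: DESeq links variance and mean by local regression across genes, whereas this sketch swaps in a simple per-gene method-of-moments estimate. The function names are illustrative assumptions.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Variance of a negative binomial count with mean mu.

    With dispersion = 0 this reduces to the Poisson variance (equal to mu).
    """
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate from replicate counts.

    Solves sample_variance = mu + dispersion * mu**2 for the dispersion,
    clipped at zero when the counts are not over-dispersed.
    """
    if len(counts) < 2:
        raise ValueError("need at least two replicates")
    mu = mean(counts)
    if mu <= 0:
        raise ValueError("mean count must be positive")
    return max(0.0, (variance(counts) - mu) / mu ** 2)


replicates = [10, 20, 30]
alpha = moment_dispersion(replicates)
print(alpha, nb_variance(mean(replicates), alpha))
```

With only two or three replicates such a per-gene estimate is very noisy, which is the motivation the abstract gives for pooling information across genes through an error model.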

    Technical Variability Is Greater than Biological Variability in a Microarray Experiment but Both Are Outweighed by Changes Induced by Stimulation

    INTRODUCTION: A central issue in the design of microarray-based analysis of global gene expression is that variability resulting from experimental processes may obscure changes resulting from the effect being investigated. This study quantified the variability in gene expression at each level of a typical in vitro stimulation experiment using human peripheral blood mononuclear cells (PBMC). The primary objective was to determine the magnitude of biological and technical variability relative to the effect being investigated, namely gene expression changes resulting from stimulation with lipopolysaccharide (LPS). METHODS AND RESULTS: Human PBMC were stimulated in vitro with LPS, with replication at 5 levels: 5 subjects each on 2 separate days with technical replication of LPS stimulation, amplification and hybridisation. RNA from samples stimulated with LPS and from unstimulated samples was hybridised against common reference RNA on oligonucleotide microarrays. There was a closer correlation in gene expression between replicate hybridisations (0.86-0.93) than between different subjects (0.66-0.78). Deconstruction of the variability at each level of the experimental process showed that technical variability (standard deviation (SD) 0.16) was greater than biological variability (SD 0.06), although both were low (SD<0.1 for all individual components). There was variability in gene expression both at baseline and after stimulation with LPS, and the proportion of cell subsets in PBMC was likely partly responsible for this. However, gene expression changes after stimulation with LPS were much greater than the variability from any source, either individually or combined. CONCLUSIONS: Variability in gene expression was very low and likely to improve further as technical advances are made.
The finding that stimulation with LPS has a markedly greater effect on gene expression than the degree of variability provides confidence that microarray-based studies can be used to detect changes in gene expression of biological interest in infectious diseases.

    NuGO contributions to GenePattern

    NuGO, the European Nutrigenomics Organization, utilizes 31 powerful computers for, e.g., data storage and analysis. These so-called black boxes (NBXses) are located at the sites of different partners. NuGO decided to use GenePattern as the preferred genomic analysis tool on each NBX. To handle the custom-made Affymetrix NuGO arrays, new NuGO modules are added to GenePattern. These NuGO modules execute the latest Bioconductor version, ensuring up-to-date annotations and access to the latest scientific developments. The following GenePattern modules are provided by NuGO: NuGOArrayQualityAnalysis for comprehensive quality control, NuGOExpressionFileCreator for import and normalization of data, LimmaAnalysis for identification of differentially expressed genes, TopGoAnalysis for calculation of GO enrichment, and GetResultForGo for retrieval of information on genes associated with specific GO terms. Altogether, these NuGO modules allow comprehensive, up-to-date, and user-friendly analysis of Affymetrix data. A special feature of the NuGO modules is that for analysis they allow the use of either the standard Affymetrix or the MBNI custom CDF-files, which remap probes based on current knowledge. In both cases a .chip-file is created to enable GSEA analysis. The NuGO GenePattern installations are distributed as binary Ubuntu (.deb) packages via the NuGO repository.

    Identifying differential exon splicing using linear models and correlation coefficients

    Background: With the availability of the Affymetrix exon arrays a number of tools have been developed to enable their analysis. These, however, can be expensive or have several pre-installation requirements. This led us to develop an analysis workflow for analysing differential splicing using freely available software packages that are already being widely used for gene expression analysis. The workflow uses the packages in the standard installation of R and Bioconductor (BiocLite) to identify differential splicing. We use the splice index method with the LIMMA framework. The main drawback with this approach is that it relies on accurate estimates of gene expression from the probe-level data. Methods such as RMA and PLIER may misestimate when a large proportion of exons are spliced. We therefore present the novel concept of a gene correlation coefficient calculated using only the probeset expression pattern within a gene. We show that genes with lower correlation coefficients are likely to be differentially spliced. Results: The LIMMA approach was used to identify several tissue-specific transcripts and splicing events that are supported by previous experimental studies. Filtering the data is necessary, particularly removing exons and genes that are not expressed in all samples and cross-hybridising probesets, in order to reduce the false positive rate. The LIMMA approach ranked genes containing single or few differentially spliced exons much higher than genes containing several differentially spliced exons. On the other hand we found the gene correlation coefficient approach better for identifying genes with a large number of differentially spliced exons. Conclusion: We show that LIMMA can be used to identify differential exon splicing from Affymetrix exon array data.
Though further work would be necessary to develop the use of correlation coefficients into a complete analysis approach, the preliminary results demonstrate their usefulness for identifying differentially spliced genes. The two approaches are complementary, as they can potentially identify different subsets of genes (single/few spliced exons vs. large transcript structure differences).
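
The splice index mentioned in the abstract above is commonly defined as exon-level expression normalised by gene-level expression; on the log2 scale this is a subtraction. The Python sketch below shows only that calculation and the difference in mean splice index between two sample groups. It is an illustrative assumption, not the authors' workflow, which fits the index within the LIMMA framework in R; the function names and the example values are hypothetical.

```python
from statistics import mean


def splice_index(exon_log2, gene_log2):
    """Per-sample splice index: log2 exon signal minus log2 gene signal."""
    if len(exon_log2) != len(gene_log2):
        raise ValueError("exon and gene vectors must have the same length")
    return [e - g for e, g in zip(exon_log2, gene_log2)]


def splice_index_difference(exon_a, gene_a, exon_b, gene_b):
    """Difference in mean splice index between group A and group B.

    Values far from zero suggest the exon is included at different rates
    in the two groups, independent of overall gene expression.
    """
    return (mean(splice_index(exon_a, gene_a))
            - mean(splice_index(exon_b, gene_b)))


# Hypothetical log2 intensities for one exon and its gene in two tissues.
diff = splice_index_difference(
    exon_a=[8.0, 8.5], gene_a=[9.0, 9.5],   # exon one log2 unit below the gene
    exon_b=[9.0, 9.5], gene_b=[9.0, 9.5],   # exon tracks the gene
)
print(diff)
```

Because the index depends on the gene-level estimate, any misestimation of gene expression propagates directly into it, which is the drawback the abstract raises.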

    Data analysis issues for allele-specific expression using Illumina's GoldenGate assay.

    BACKGROUND: High-throughput measurement of allele-specific expression (ASE) is a relatively new and exciting application area for array-based technologies. In this paper, we explore several data sets which make use of Illumina's GoldenGate BeadArray technology to measure ASE. This platform exploits coding SNPs to obtain relative expression measurements for alleles at approximately 1500 positions in the genome. RESULTS: We analyze data from a mixture experiment where genomic DNA samples from pairs of individuals of known genotypes are pooled to create allelic imbalances at varying levels for the majority of SNPs on the array. We observe that GoldenGate has less sensitivity at detecting subtle allelic imbalances (around 1.3 fold) compared to extreme imbalances, and note the benefit of applying local background correction to the data. Analysis of data from a dye-swap control experiment allowed us to quantify dye-bias, which can be reduced considerably by careful normalization. The need to filter the data before carrying out further downstream analysis to remove non-responding probes, which show either weak, or non-specific signal for each allele, was also demonstrated. Throughout this paper, we find that a linear model analysis of the data from each SNP is a flexible modelling strategy that allows for testing of allelic imbalances in each sample when replicate hybridizations are available. CONCLUSIONS: Our analysis shows that local background correction carried out by Illumina's software, together with quantile normalization of the red and green channels within each array, provides optimal performance in terms of false positive rates. In addition, we strongly encourage intensity-based filtering to remove SNPs which only measure non-specific signal. 
We anticipate that a similar analysis strategy will prove useful when quantifying ASE on Illumina's higher density Infinium BeadChips.
RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
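
The GoldenGate abstract above recommends quantile normalization of the red and green channels within each array. A minimal Python sketch of the general technique follows; it is not Illumina's or the authors' code. It assumes two equal-length intensity lists for one array and ignores special handling of ties, and the function name is hypothetical.

```python
def quantile_normalize_channels(red, green):
    """Give both channels of one array the same intensity distribution.

    Each value is replaced by the mean of the red and green values that
    share its rank, so the sorted channels become identical afterwards.
    """
    if len(red) != len(green):
        raise ValueError("channels must have the same number of probes")
    n = len(red)
    # Probe indices ordered from lowest to highest intensity per channel.
    order_red = sorted(range(n), key=red.__getitem__)
    order_green = sorted(range(n), key=green.__getitem__)
    norm_red = [0.0] * n
    norm_green = [0.0] * n
    for i, j in zip(order_red, order_green):
        reference = (red[i] + green[j]) / 2.0  # shared value for this rank
        norm_red[i] = reference
        norm_green[j] = reference
    return norm_red, norm_green


norm_red, norm_green = quantile_normalize_channels(
    [100.0, 300.0, 200.0], [50.0, 70.0, 150.0]
)
print(norm_red, norm_green)
```

After normalization, any remaining difference between the two channels at a SNP reflects a difference in rank rather than a global dye effect, which is one way to see why the step reduces dye bias.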

    Messina: A Novel Analysis Tool to Identify Biologically Relevant Molecules in Disease

    BACKGROUND: Morphologically similar cancers display heterogeneous patterns of molecular aberrations and follow substantially different clinical courses. This diversity has become the basis for the definition of molecular phenotypes, with significant implications for therapy. Microarray or proteomic expression profiling is conventionally employed to identify disease-associated genes; however, traditional approaches for the analysis of profiling experiments may miss molecular aberrations which define biologically relevant subtypes. METHODOLOGY/PRINCIPAL FINDINGS: Here we present Messina, a method that can identify those genes that only sometimes show aberrant expression in cancer. We demonstrate with simulated data that Messina is highly sensitive and specific when used to identify genes which are aberrantly expressed in only a proportion of cancers, and compare Messina to contemporary analysis techniques. We illustrate Messina by using it to detect the aberrant expression of a gene that may play an important role in pancreatic cancer. CONCLUSIONS/SIGNIFICANCE: Messina allows the detection of genes with profiles typical of markers of molecular subtype, and complements existing methods to assist the identification of such markers. Messina is applicable to any global expression profiling data and, to allow its easy application, has been packaged into a freely available stand-alone software package.

    Transcriptome analyses of mouse and human mammary cell subpopulations reveal multiple conserved genes and pathways

    INTRODUCTION: Molecular characterization of the normal epithelial cell types that reside in the mammary gland is an important step toward understanding pathways that regulate self-renewal, lineage commitment, and differentiation along the hierarchy. Here we determined the gene expression signatures of four distinct subpopulations isolated from the mouse mammary gland. The epithelial cell signatures were used to interrogate mouse models of mammary tumorigenesis and to compare with their normal human counterpart subsets to identify conserved genes and networks. METHODS: RNA was prepared from freshly sorted mouse mammary cell subpopulations (mammary stem cell (MaSC)-enriched, committed luminal progenitor, mature luminal and stromal cell) and used for gene expression profiling analysis on the Illumina platform. Gene signatures were derived and compared with those previously reported for the analogous normal human mammary cell subpopulations. The mouse and human epithelial subset signatures were then subjected to Ingenuity Pathway Analysis (IPA) to identify conserved pathways. RESULTS: The four mouse mammary cell subpopulations exhibited distinct gene signatures. Comparison of these signatures with the molecular profiles of different mouse models of mammary tumorigenesis revealed that tumors arising in MMTV-Wnt-1 and p53-/- mice were enriched for MaSC-subset genes, whereas the gene profiles of MMTV-Neu and MMTV-PyMT tumors were most concordant with the luminal progenitor cell signature. Comparison of the mouse mammary epithelial cell signatures with their human counterparts revealed substantial conservation of genes, whereas IPA highlighted a number of conserved pathways in the three epithelial subsets. CONCLUSIONS: The conservation of genes and pathways across species further validates the use of the mouse as a model to study mammary gland development and highlights pathways that are likely to govern cell-fate decisions and differentiation. 
It is noteworthy that many of the conserved genes in the MaSC population have been considered as epithelial-mesenchymal transition (EMT) signature genes. Therefore, the expression of these genes in tumor cells may reflect basal epithelial cell characteristics and not necessarily cells that have undergone an EMT. Comparative analyses of normal mouse epithelial subsets with murine tumor models have implicated distinct cell types in contributing to tumorigenesis in the different models.